Multivariate high-frequency financial data via semi-Markov processes
In this paper we propose a bivariate generalization of the weighted-indexed semi-Markov chain to study
the high-frequency price dynamics of traded stocks. We assume that financial returns are described by a
weighted-indexed semi-Markov chain model. We show, through Monte Carlo simulations, that the model
is able to reproduce important stylized facts of financial time series, such as the persistence of volatility,
and at the same time the correlation between stocks. The model is applied to data from the Italian
stock market from 1 January 2007 until the end of December 2010.

Keywords: Financial market; semi-Markov chain; Bivariate processes 
1. Introduction 

Semi-Markov processes (SMP) are a wide class of stochastic processes which generalize
both Markov chains and renewal processes. The main advantage of SMP is that they allow the use
of any type of waiting time distribution for modeling the time to a transition from one state
to another. By contrast, Markovian models constrain the distribution of the waiting
times in the states, which must be represented by memoryless distributions (exponential
or geometric for the continuous and discrete time cases, respectively). This greater flexibility comes at a
price: the parameters to be estimated are more numerous.

In three recent papers (D'Amico and Petroni, 2011, 2012a, 2012b) we showed that returns of stocks from
financial markets are better represented by semi-Markov processes than by a simple Markov chain. In
particular, we showed that volatility clustering, one of the main stylized facts of financial markets,
is almost exactly reproduced by an indexed semi-Markov process. More importantly, in our models
the volatility autocorrelation is obtained endogenously, without introducing external or latent auxiliary
stochastic processes. To further improve our previous results, in this work we propose a bivariate model
where the indexed semi-Markov process is used to reproduce two stocks simultaneously.

The database used for the analysis consists of high-frequency tick-by-tick price data for 20 stocks
in the Italian market from 1 January 2007 until the end of December 2010. From prices we then define
returns at one-minute frequency.
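As an illustration of how one-minute returns can be obtained from tick-by-tick prices, the following minimal sketch keeps the last traded price in each minute and computes simple returns between consecutive minutes. The function name `one_minute_returns`, the choice of simple returns, and the decision to skip minutes without trades are assumptions made for this example; the text does not specify how the re-sampling was done.

```python
from datetime import datetime, timedelta


def one_minute_returns(ticks):
    """Turn (timestamp, price) ticks into one-minute simple returns.

    Assumptions of this sketch: the last tick in a minute is that minute's
    price, and a return is produced only between two consecutive minutes
    that both contain at least one tick.
    """
    last_price = {}
    for ts, price in sorted(ticks):
        minute = ts.replace(second=0, microsecond=0)
        last_price[minute] = price  # later ticks overwrite earlier ones

    minutes = sorted(last_price)
    returns = []
    for prev, cur in zip(minutes, minutes[1:]):
        if cur - prev == timedelta(minutes=1):  # skip gaps in trading
            p0, p1 = last_price[prev], last_price[cur]
            returns.append((cur, (p1 - p0) / p0))
    return returns
```

A gap-filling rule (for example carrying the last price forward, which yields zero returns) would be an equally plausible choice and changes the share of zero returns in the sample.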



2. The Weighted-Indexed Semi-Markov Model 

In this section we describe the weighted-indexed semi-Markov model, which is able to represent higher-order
dependencies between successive observations of a state variable. One way to increase the memory
of the process is to use high-order semi-Markov processes as defined in Limnios and Oprisan
(2003) and more recently revisited and extended in a discrete time framework in D'Amico et al. (2012).
A more parsimonious model, named the indexed semi-Markov chain (ISMC) model, was defined by
D'Amico and Petroni (2011), where it is shown that it appropriately describes important empirical
regularities of financial time series such as the first passage time distribution and the autocorrelation
function. A further improvement of the ISMC model, named the Weighted-Indexed Semi-Markov Chain
(WISMC) model, was proposed in D'Amico and Petroni (2012b); it makes it possible to reproduce
long-term dependence in the stock returns in a very efficient way.

Let us briefly describe the WISMC model. We assume that the value of the financial asset under
study is described by the time-varying asset price $S(t)$. The return at time $t$ calculated over a time
interval of length 1 is defined as $\frac{S(t+1)-S(t)}{S(t)}$. Since the return process changes value in time, we denote by
$\{J_n\}_{n\in\mathbb{N}}$ the stochastic process with finite state space $E = \{1,2,\ldots,s\}$ describing the value of the
return process at the $n$-th change.

Let us consider the stochastic process $\{T_n\}_{n\in\mathbb{N}}$ with values in $\mathbb{N}$. The random variable $T_n$ describes
the time at which the $n$-th change of the return process occurs.

Let us also consider the stochastic process $\{U_n^\lambda\}_{n\in\mathbb{N}}$ with values in $\mathbb{R}$. The random variable $U_n^\lambda$
describes the value of the index process at the $n$-th transition.

In D'Amico (2011), the process $\{U_n\}$ was defined as a reward accumulation process linked
to the Markov renewal process $\{J_n, T_n\}$; in D'Amico and Petroni (2011) the process $\{U_n\}$ was defined
as a moving average of the reward process. In D'Amico and Petroni (2012b), motivated by the application
to financial returns, a more flexible index process was defined:

$$U_n^\lambda = \sum_{k=0}^{n-1} \sum_{a=T_{n-1-k}}^{T_{n-k}-1} f(J_{n-1-k}, a, \lambda), \qquad (2.1)$$

where $f : E \times \mathbb{N} \times \mathbb{R} \to \mathbb{R}$ is a Borel measurable bounded function and $U_0^\lambda$ is known and non-random.

The process $U_n^\lambda$ can be interpreted as an accumulated reward process with the function $f$ as a measure
of the weighted rate of reward per unit time. The function $f$ depends on the current time $a$, on the
state $J_{n-1-k}$ visited at the current time and on the parameter $\lambda$ that represents the weight.
The application of the model requires the choice of a specific functional form of $f$.
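The double sum in (2.1) can be sketched in a few lines. In this illustrative example the names `index_process`, `states`, `times` and `lam` are assumptions: `states[m]` stands for $J_m$, `times[m]` for $T_m$, and `f` is any user-supplied reward function. The known initial value $U_0^\lambda$ is taken to be zero here, which is a simplification.

```python
def index_process(states, times, n, f, lam):
    """Accumulate f over every time unit spent in each of the first n states.

    Implements U_n = sum_{k=0}^{n-1} sum_{a=T_{n-1-k}}^{T_{n-k}-1} f(J_{n-1-k}, a, lam).
    states[m] is the m-th visited state, times[m] its entrance time.
    Assumption of this sketch: U_0 = 0.
    """
    total = 0.0
    for k in range(n):
        m = n - 1 - k                      # index of the visited state
        for a in range(times[m], times[m + 1]):
            total += f(states[m], a, lam)  # reward for time unit a
    return total
```

For instance, with `f = lambda j, a, lam: lam * j` every time unit spent in state `j` contributes `lam * j`, so the index is a weighted sum of sojourn lengths.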

The WISMC model is completely specified once a dependence structure between the variables is
considered. To this end, the following assumption was made:

$$\mathbb{P}[J_{n+1} = j,\, T_{n+1} - T_n \le t \mid \sigma(J_h, T_h, U_h^\lambda),\, h = 0,\ldots,n,\, J_n = i,\, U_n^\lambda = v]$$
$$= \mathbb{P}[J_{n+1} = j,\, T_{n+1} - T_n \le t \mid J_n = i,\, U_n^\lambda = v] := Q_{ij}^\lambda(v;t), \qquad (2.2)$$

where $\sigma(J_h, T_h, U_h^\lambda)$, $h \le n$, is the natural filtration of the three-variate process.

The matrix of functions $\mathbf{Q}^\lambda(v;t) = (Q_{ij}^\lambda(v;t))_{i,j\in E}$ is called the weighted-indexed semi-Markov kernel.

The joint process $(J_n, T_n)$ depends on the process $U_n^\lambda$; the latter acts as a stochastic index. Moreover,
the index process $U_n^\lambda$ depends on $(J_n, T_n)$ through the functional relationship (2.1).
Observe that if

$$\mathbb{P}[J_{n+1} = j,\, T_{n+1} - T_n \le t \mid J_n = i,\, U_n^\lambda = v] = \mathbb{P}[J_{n+1} = j,\, T_{n+1} - T_n \le t \mid J_n = i]$$

for all values $v \in \mathbb{R}$ of the index process, then the WISMC kernel degenerates into an ordinary semi-Markov
kernel and the WISMC model becomes equivalent to the classical semi-Markov chain model.
The triple of processes $\{J_n, T_n, U_n^\lambda\}$ describes the behavior of the system only at
the transition times $T_n$. To describe the behavior of our model at any time $t$, which can be a
transition time or a waiting time, we need to define additional stochastic processes.

Given the three-dimensional process $\{J_n, T_n, U_n^\lambda\}$ and the weighted-indexed semi-Markov kernel
$\mathbf{Q}^\lambda(v;t)$, we define

$$N(t) = \sup\{n \in \mathbb{N} : T_n \le t\};$$
$$Z(t) = J_{N(t)};$$
$$U^\lambda(t) = \sum_{k=0}^{N(t)-1+\theta} \; \sum_{a=T_{N(t)+\theta-1-k}}^{(t \wedge T_{N(t)+\theta-k})-1} f(J_{N(t)+\theta-1-k}, a, \lambda), \qquad (2.3)$$

where $\theta = 1_{\{t > T_{N(t)}\}}$.

The stochastic processes defined in (2.3) represent the number of transitions up to time $t$, the state
of the system (price return) at time $t$ and the value of the index process (weighted moving average of a
function of price return) up to $t$, respectively. We refer to $Z(t)$ as a weighted-indexed semi-Markov
process.

The process $U^\lambda(t)$ is a generalization of the process $U_n^\lambda$ where time $t$ can be a transition or a waiting
time. It is simple to realize that if $t = T_n$ we have that $U^\lambda(t) = U_n^\lambda$.
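The three processes in (2.3) can be evaluated from an observed trajectory as in the following sketch. All function and argument names are hypothetical; `times` is assumed to be a strictly increasing list of integer transition times with `times[0] <= t`, and when the next transition has not yet been observed the inner sum is simply truncated at `t`.

```python
from bisect import bisect_right


def count_transitions(times, t):
    """N(t) = sup{n : T_n <= t} for increasing transition times."""
    return bisect_right(times, t) - 1


def state_at(states, times, t):
    """Z(t) = J_{N(t)}: the state occupied at time t."""
    return states[count_transitions(times, t)]


def index_at(states, times, t, f, lam):
    """U(t) as in (2.3), with theta = 1 when t is a waiting time."""
    n = count_transitions(times, t)
    theta = 1 if t > times[n] else 0
    total = 0.0
    for k in range(n + theta):                 # k = 0, ..., N(t) - 1 + theta
        m = n + theta - 1 - k
        # upper limit (t ^ T_{m+1}) - 1; fall back to t if T_{m+1} is unknown
        upper = min(t, times[m + 1]) if m + 1 < len(times) else t
        for a in range(times[m], upper):
            total += f(states[m], a, lam)
    return total
```

When `t` coincides with a transition time the indicator `theta` is zero and `index_at` reduces to the sum in (2.1), in line with the remark that $U^\lambda(T_n) = U_n^\lambda$.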
Let

$$p_{ij}^\lambda(v) := \mathbb{P}[J_{n+1} = j \mid J_n = i,\, U_n^\lambda = v]$$

be the transition probabilities of the embedded indexed Markov chain. It denotes the probability that the
next transition is into state $j$ given that at the current time the process entered state $i$ and the index process
is equal to $v$. It is simple to realize that

$$p_{ij}^\lambda(v) = \lim_{t\to\infty} Q_{ij}^\lambda(v;t). \qquad (2.4)$$

Let $H_i^\lambda(v;\cdot)$ be the sojourn time cumulative distribution in state $i \in E$:

$$H_i^\lambda(v;t) := \mathbb{P}[T_{n+1} - T_n \le t \mid J_n = i,\, U_n^\lambda = v] = \sum_{j\in E} Q_{ij}^\lambda(v;t). \qquad (2.5)$$

It expresses the probability of making a transition from state $i$ with sojourn time less than or equal to $t$, given that
the index process is equal to $v$.

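The conditional waiting time distribution function $G$ expresses the following probability:

$$G_{ij}^\lambda(v;t) := \mathbb{P}[T_{n+1} - T_n \le t \mid J_n = i,\, J_{n+1} = j,\, U_n^\lambda = v]. \qquad (2.6)$$

It is simple to establish that

$$G_{ij}^\lambda(v;t) = \begin{cases} \dfrac{Q_{ij}^\lambda(v;t)}{p_{ij}^\lambda(v)} & \text{if } p_{ij}^\lambda(v) \neq 0, \\[1ex] 1 & \text{if } p_{ij}^\lambda(v) = 0. \end{cases} \qquad (2.7)$$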
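Relations (2.4) to (2.7) are easy to check numerically once the kernel is tabulated. The sketch below assumes, purely for illustration, that for a fixed index value $v$ the kernel is stored as a nested list `Q[i][j][t]` of cumulative values on a finite time grid, and that the last grid point is late enough to stand in for the limit in (2.4). The function names are hypothetical.

```python
def embedded_probs(Q):
    """p_ij = lim_t Q_ij(t), approximated by the last tabulated time."""
    return [[q_ij[-1] for q_ij in row] for row in Q]


def sojourn_cdf(Q):
    """H_i(t) = sum_j Q_ij(t), one list of values over t per state i."""
    return [[sum(values) for values in zip(*row)] for row in Q]


def conditional_waiting_cdf(Q):
    """G_ij(t) = Q_ij(t) / p_ij when p_ij != 0, and 1 otherwise."""
    p = embedded_probs(Q)
    return [
        [
            [q / p[i][j] if p[i][j] != 0 else 1.0 for q in Q[i][j]]
            for j in range(len(Q[i]))
        ]
        for i in range(len(Q))
    ]
```

In an application one such table would be needed for each discretized value of the index process, since the kernel depends on $v$.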

In D'Amico (2011) and D'Amico and Petroni (2012b), explicit renewal-type equations
were given to describe the probabilistic behavior of the ISMC model. Similar results could be derived
for the WISMC model, but here we prefer to derive other results which are strictly relevant to the
multivariate model presented in the next section.
As is well known, it is possible to give an alternative description of the semi-Markov process
by introducing the backward recurrence time process $B(t) := t - T_{N(t)}$ and describing the probabilistic
behavior of the Markov process $(Z(t), B(t))$ on the extended state space $E \times \mathbb{N}$, where $\mathbb{N}$ is here restricted to $\{0, 1, \ldots, N\}$
and $N$ is the maximum length of stay in the states of the process. This technique was first proposed in
Vassiliou and Papadopoulou (1992) and proved useful in studying certain aspects of non-homogeneous
semi-Markov processes. In our more general setting it is also possible to describe the system behavior
by using the backward recurrence time process. This choice is adopted here to obtain a description of the
one-step transition probabilities of the WISMC model, and it proves very useful in the next section for
the definition of the bivariate model.

Let us denote by

$$p^\lambda_{((i,u)(j,d))}(v) := \mathbb{P}[Z(n+1) = j,\, B(n+1) = d \mid Z(n) = i,\, B(n) = u,\, U^\lambda(n) = v]. \qquad (2.8)$$



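The probabilities (2.8) can be obtained from the indexed semi-Markov kernel. To prove this, we
first need the following lemma.

Lemma 2.1 Suppose that $U^\lambda(n) = v$, $T_{N(n)} = n - u$ and $T_{N(n)+1} > n$. Then

$$U^\lambda_{N(n)} = v - \sum_{a=T_{N(n)}}^{n-1} f(J_{N(n)}, n-a, \lambda) + \sum_{k=0}^{N(n)-1} \; \sum_{a=T_{N(n)-1-k}}^{T_{N(n)-k}-1} \Delta f(J_{N(n)-1-k}, T_{N(n)}, n, a), \qquad (2.9)$$

where $\Delta f(i, T_{N(n)}, n, a) := f(i, T_{N(n)} - a, \lambda) - f(i, n - a, \lambda)$.

Proof. Consider the quantity $U^\lambda_{N(n)} - U^\lambda(n)$. Since $T_{N(n)} = n - u$ and $T_{N(n)+1} > n$, the time $n$ is a waiting time and consequently

$$U^\lambda(n) = \sum_{k=0}^{N(n)} \; \sum_{a=T_{N(n)-k}}^{(n \wedge T_{N(n)-k+1})-1} f(J_{N(n)-k}, n-a, \lambda).$$

Then

$$U^\lambda_{N(n)} - U^\lambda(n) = \sum_{k=0}^{N(n)-1} \; \sum_{a=T_{N(n)-1-k}}^{T_{N(n)-k}-1} f(J_{N(n)-1-k}, T_{N(n)} - a, \lambda) - \sum_{k=0}^{N(n)} \; \sum_{a=T_{N(n)-k}}^{(n \wedge T_{N(n)-k+1})-1} f(J_{N(n)-k}, n-a, \lambda).$$

Separating the $k = 0$ term of the second double sum, for which $n \wedge T_{N(n)+1} = n$, and shifting the index $k$ by one in the remaining terms, for which $n \wedge T_{N(n)-k} = T_{N(n)-k}$, we obtain

$$U^\lambda_{N(n)} - U^\lambda(n) = \sum_{k=0}^{N(n)-1} \; \sum_{a=T_{N(n)-1-k}}^{T_{N(n)-k}-1} \Big( f(J_{N(n)-1-k}, T_{N(n)} - a, \lambda) - f(J_{N(n)-1-k}, n-a, \lambda) \Big) - \sum_{a=T_{N(n)}}^{n-1} f(J_{N(n)}, n-a, \lambda),$$

and by considering that $\Delta f(i, T_{N(n)}, n, a) := f(i, T_{N(n)} - a, \lambda) - f(i, n - a, \lambda)$ and $U^\lambda(n) = v$, by substitution
we recover formula (2.9). □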



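Theorem 2.1 For all $i, j \in E$, $u, d \in \mathbb{N}$ and $v \in \mathbb{R}$, the one-step transition probabilities
$p^\lambda_{((i,u)(j,d))}(v) := \mathbb{P}[Z(n+1) = j,\, B(n+1) = d \mid Z(n) = i,\, B(n) = u,\, U^\lambda(n) = v]$ are given by

$$p^\lambda_{((i,u)(j,d))}(v) = \begin{cases} \dfrac{\bar{H}_i^\lambda(v + \Delta U(N(n),n);\, 1+u)}{\bar{H}_i^\lambda(v + \Delta U(N(n),n);\, u)} & \text{if } j = i,\ d = 1+u, \\[2ex] \dfrac{q_{ij}^\lambda(v + \Delta U(N(n),n);\, 1+u)}{\bar{H}_i^\lambda(v + \Delta U(N(n),n);\, u)} & \text{if } j \neq i,\ d = 0, \end{cases} \qquad (2.10)$$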



where $\bar{H}_i^\lambda(v;t) = 1 - H_i^\lambda(v;t)$ is the survival function of the sojourn time in state $i$, $q_{ij}^\lambda(x,t) = Q_{ij}^\lambda(x,t) - Q_{ij}^\lambda(x,t-1)$
and $\Delta U(N(n),n) = U^\lambda_{N(n)} - U^\lambda(n)$ is the opposite of the variation of the index process over
the waiting time $n - T_{N(n)}$.

Proof. Since the events $\{T_{N(n)+1} > n+1\}$ and $\{T_{N(n)+1} \le n+1\}$ are disjoint and exhaustive, it follows that

$$\mathbb{P}[Z(n+1) = j,\, B(n+1) = d \mid Z(n) = i,\, B(n) = u,\, U^\lambda(n) = v]$$
$$= \mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda(n) = v]$$
$$+\ \mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} \le n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda(n) = v]. \qquad (2.11)$$

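If we write $U^\lambda_{N(n)} = U^\lambda(n) + \Delta U(N(n),n)$, the first addend on the r.h.s. of (2.11) becomes

$$\mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda(n) = v]$$
$$= \mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$= \frac{\mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} > n+1,\, T_{N(n)+1} > n \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}{\mathbb{P}[T_{N(n)+1} > n \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}. \qquad (2.12)$$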

The denominator of (2.12) can be computed as follows:

$$\mathbb{P}[T_{N(n)+1} - T_{N(n)} > n - (n-u) \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$= \mathbb{P}[T_{N(n)+1} - T_{N(n)} > u \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)] = 1 - H_i^\lambda(v + \Delta U(N(n),n);\, u), \qquad (2.13)$$

where the last equality is obtained using (2.5).

The numerator of (2.12) can be evaluated as follows:

$$\mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d,\, T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$= \mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d \mid T_{N(n)+1} > n+1,\, J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$\cdot\ \mathbb{P}[T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]. \qquad (2.14)$$

Now note that if $T_{N(n)+1} > n+1$ then $N(n+1) = N(n)$, which implies that $T_{N(n+1)} = T_{N(n)}$, i.e.
$n+1-d = n-u$, which gives $d = 1+u$. The equality $N(n+1) = N(n)$ also implies $J_{N(n+1)} = J_{N(n)}$, i.e.
$j = i$.

Then (2.14) is equal to

$$1_{\{j=i\}} 1_{\{d=1+u\}}\, \mathbb{P}[T_{N(n)+1} > n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$= 1_{\{j=i\}} 1_{\{d=1+u\}}\, \mathbb{P}[T_{N(n)+1} - T_{N(n)} > n+1-(n-u) \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]$$
$$= 1_{\{j=i\}} 1_{\{d=1+u\}} \big(1 - H_i^\lambda(v + \Delta U(N(n),n);\, 1+u)\big). \qquad (2.15)$$

Summarizing, (2.12) is given by

$$\frac{1_{\{j=i\}} 1_{\{d=1+u\}} \big(1 - H_i^\lambda(v + \Delta U(N(n),n);\, 1+u)\big)}{1 - H_i^\lambda(v + \Delta U(N(n),n);\, u)}. \qquad (2.16)$$

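It remains to compute the second addend on the r.h.s. of equation (2.11). This probability can be
factorized into

$$\mathbb{P}[J_{N(n+1)} = j,\, T_{N(n+1)} = n+1-d \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} \le n+1,\, T_{N(n)+1} > n,\, U^\lambda(n) = v]$$
$$\cdot\ \mathbb{P}[T_{N(n)+1} \le n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda(n) = v]. \qquad (2.17)$$

Since $T_{N(n)+1} > n$ and $T_{N(n)+1} \le n+1$ together force $T_{N(n)+1} = n+1$, we have $N(n+1) = N(n)+1$ and therefore $d = 0$, so that (2.17) equals

$$1_{\{d=0\}}\, \frac{\mathbb{P}[J_{N(n)+1} = j,\, T_{N(n)+1} = n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}{\mathbb{P}[T_{N(n)+1} = n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}$$
$$\cdot\ \mathbb{P}[T_{N(n)+1} \le n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, T_{N(n)+1} > n,\, U^\lambda(n) = v], \qquad (2.18)$$

and since $d = 0$, we must have $j \neq i$. Then we get

$$1_{\{d=0\}} 1_{\{j \neq i\}}\, \frac{q_{ij}^\lambda(v + \Delta U(N(n),n);\, 1+u)}{\mathbb{P}[T_{N(n)+1} = n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}$$
$$\cdot\ \frac{\mathbb{P}[T_{N(n)+1} = n+1 \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]}{\mathbb{P}[T_{N(n)+1} > n \mid J_{N(n)} = i,\, T_{N(n)} = n-u,\, U^\lambda_{N(n)} = v + \Delta U(N(n),n)]} \qquad (2.19)$$

$$= 1_{\{d=0\}} 1_{\{j \neq i\}}\, \frac{q_{ij}^\lambda(v + \Delta U(N(n),n);\, 1+u)}{1 - H_i^\lambda(v + \Delta U(N(n),n);\, u)}. \qquad (2.20)$$

A substitution of (2.16) and (2.20) in (2.11) completes the proof. □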

Remark 2.1 The computation of the probabilities (2.8) can be done through formula (2.10), where it
is necessary to evaluate the quantity $\Delta U(N(n),n)$. This last quantity is obtained thanks to Lemma 2.1
and has to be recalculated step by step.
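To make the content of Theorem 2.1 concrete, the following sketch computes the one-step transition probabilities of the pair (state, backward recurrence time) from a tabulated kernel. It assumes that `Q[i][j][t]` already holds the cumulative kernel evaluated at the shifted index value $v + \Delta U(N(n),n)$, that time is on an integer grid, and that there are no transitions from a state to itself. The function name and data layout are illustrative assumptions.

```python
def one_step_probs(Q, i, u):
    """Return {(j, d): probability} for the next (state, backward time).

    Q[i][j][t] is the cumulative kernel at the shifted index value.
    Either the process stays in i and the backward time grows to u + 1,
    or it jumps to some j != i and the backward time resets to 0.
    """
    n_states = len(Q)

    def sojourn_cdf(t):
        return sum(Q[i][j][t] for j in range(n_states))

    survival_u = 1.0 - sojourn_cdf(u)
    if survival_u <= 0.0:
        raise ValueError("sojourn of length u has zero probability")

    probs = {(i, u + 1): (1.0 - sojourn_cdf(u + 1)) / survival_u}
    for j in range(n_states):
        if j != i:
            q = Q[i][j][u + 1] - Q[i][j][u]   # mass of a jump at u + 1
            probs[(j, 0)] = q / survival_u
    return probs
```

Under the stated assumptions the returned probabilities sum to one, which gives a quick sanity check on any estimated kernel.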

3. The Bivariate Weighted-Indexed Semi-Markov Model 

In this section we extend the WISMC model to a multivariate setting. For simplicity we
explain the model only for the bivariate case; the multivariate extension is straightforward.

Suppose that we have a bivariate series of high-frequency financial data concerning stock returns.
Moreover, we assume that each of the two stocks is modeled via a WISMC model. By $J_n^i$,
$T_n^i$, $U_n^i$ and $Z^i(t)$ we denote the return at the $n$-th change, the time of the $n$-th change, the value of the
index at the $n$-th transition and the state of the return at time $t$ for the stock $i \in \{1,2\}$, respectively.

In order to define a bivariate model, it is convenient to introduce the backward recurrence time process
for the stock $i$ defined, for each time $t \in \mathbb{N}$, by $B^i(t) = t - T^i_{N^i(t)}$, where $N^i(t)$ is the counting process
associated with the stock $i$. The reason for introducing the backward recurrence time process is
that it complements the semi-Markov process to give a Markov process on an extended state space. This
simplifies the definition of the bivariate model, which can now be conveniently defined in terms of the
triplet $(Z^i(t), B^i(t), U^{\lambda_i}(t))$.

To define the model we need to formulate three assumptions, named A1, A2 and
A3 in the following. Before stating the assumptions we introduce some auxiliary notation: $\mathbf{Z}(n) = (Z^1(n), Z^2(n))$,
$\mathbf{B}(n) = (B^1(n), B^2(n))$, $\mathbf{U}^\lambda(n) = (U^{\lambda_1}(n), U^{\lambda_2}(n))$, $\mathbf{j} = (j_1, j_2)$, $\mathbf{i} = (i_1, i_2)$, $\mathbf{d} = (d_1, d_2)$, $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$.

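ASSUMPTION A1:

$$\mathbb{P}[\mathbf{Z}(n+1) = \mathbf{j},\, \mathbf{B}(n+1) = \mathbf{d} \mid \sigma(\mathbf{Z}(h), \mathbf{B}(h)),\, 0 \le h \le n,\, \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u}]$$
$$= \mathbb{P}[\mathbf{Z}(n+1) = \mathbf{j},\, \mathbf{B}(n+1) = \mathbf{d} \mid \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]. \qquad (3.1)$$

Assumption A1 states that the knowledge of $(\mathbf{Z}(n) = \mathbf{i}, \mathbf{B}(n) = \mathbf{u}, \mathbf{U}^\lambda(n) = \mathbf{v})$ suffices to give the conditional
distribution of the couple $(\mathbf{Z}(n+1), \mathbf{B}(n+1))$ whatever the values of the past variables might be.
It is simple to realize that:

$$\mathbb{P}[\mathbf{Z}(n+1) = \mathbf{j},\, \mathbf{B}(n+1) = \mathbf{d} \mid \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]$$
$$= \mathbb{P}[Z^1(n+1) = j_1,\, B^1(n+1) = d_1 \mid Z^2(n+1) = j_2,\, B^2(n+1) = d_2,\, \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]$$
$$\cdot\ \mathbb{P}[Z^2(n+1) = j_2,\, B^2(n+1) = d_2 \mid \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]. \qquad (3.2)$$

To compute (3.2) we need to formulate additional hypotheses.

ASSUMPTION A2:

$$\mathbb{P}[Z^2(n+1) = j_2,\, B^2(n+1) = d_2 \mid \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]$$
$$= \mathbb{P}[Z^2(n+1) = j_2,\, B^2(n+1) = d_2 \mid Z^2(n) = i_2,\, B^2(n) = u_2,\, U^{\lambda_2}(n) = v_2] =: p^{2}_{((i_2,u_2)(j_2,d_2))}(v_2). \qquad (3.3)$$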

Assumption A2 states that the next state of the return and the next duration of stock 2 depend only
on the same variables at the previous time. This hypothesis can also be seen as a hierarchical
assumption: stock 2 is the leading stock, evolving with its own dynamics, whereas the evolution of stock 1
depends on that of stock 2. It should be noted that it is possible to invert the hierarchy between
the two stocks.

ASSUMPTION A3:

$$\mathbb{P}[Z^1(n+1) = j_1,\, B^1(n+1) = d_1 \mid Z^2(n+1) = j_2,\, B^2(n+1) = d_2,\, \mathbf{Z}(n) = \mathbf{i},\, \mathbf{B}(n) = \mathbf{u},\, \mathbf{U}^\lambda(n) = \mathbf{v}]$$
$$= \mathbb{P}[Z^1(n+1) = j_1,\, B^1(n+1) = d_1 \mid \mathrm{sgn}(Z^2(n+1)) = s_2,\, Z^1(n) = i_1,\, B^1(n) = u_1,\, U^{\lambda_1}(n) = v_1], \qquad (3.4)$$

where $\mathrm{sgn}(Z^2(n+1))$ is the sign of $Z^2(n+1)$, which can assume the values $+, 0, -$ according to whether
stock 2 exhibits a positive, constant or negative return, respectively. This assumption is very
important, as it drastically reduces the dimensionality of the model while still preserving the cross-correlation
between the two stocks.

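Summarizing, assumptions A1, A2 and A3 allow us to compute the joint one-step transition
probability (3.1) of the two stocks as the product

$$p^{1|2}_{((i_1,u_1)(j_1,d_1))}(v_1; s_2)\; p^{2}_{((i_2,u_2)(j_2,d_2))}(v_2), \qquad (3.5)$$

where the first factor is the conditional probability on the right-hand side of (3.4) and the second is the probability defined in (3.3).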
The probabilities $p^{2}_{((i_2,u_2)(j_2,d_2))}(v_2)$ have been evaluated in the previous section, where they were
represented as a function of the weighted-indexed semi-Markov kernel. In D'Amico, Petroni
and Prattico (2013) a nonparametric estimator of the weighted-indexed semi-Markov kernel was derived.
From this estimator it is immediate to recover a plug-in estimator of (2.10).
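The hierarchical structure implied by A2 and A3 suggests a simple two-stage simulation step: draw the next (state, backward time) of stock 2 from its own one-step distribution, then draw stock 1 conditionally on the sign of the new state of stock 2. The sketch below assumes that return states are coded as signed integers so that the sign is meaningful, and that the one-step distributions are given as dictionaries. All names are hypothetical.

```python
import random


def sample(dist, rng):
    """Draw an outcome from a {outcome: probability} dictionary."""
    x, acc = rng.random(), 0.0
    for outcome, prob in dist.items():
        acc += prob
        if x < acc:
            return outcome
    return outcome  # guards against rounding when probabilities sum to ~1


def sign(state):
    """Sign of a return state coded as a signed integer: -1, 0 or 1."""
    return (state > 0) - (state < 0)


def bivariate_step(p2, p1_given_sign, rng):
    """One joint step: stock 2 leads, stock 1 follows its sign."""
    next2 = sample(p2, rng)
    next1 = sample(p1_given_sign[sign(next2[0])], rng)
    return next1, next2


def joint_prob(p2, p1_given_sign, next1, next2):
    """Joint one-step probability as the product of the two factors."""
    return p1_given_sign[sign(next2[0])].get(next1, 0.0) * p2.get(next2, 0.0)
```

In a full simulation the dictionaries `p2` and `p1_given_sign` would be rebuilt at every step from the current states, backward times and index values of the two stocks.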

The probabilities $p^{1|2}_{((i_1,u_1)(j_1,d_1))}(v_1; s_2)$ can also be evaluated directly from the data. To this end it
is sufficient to consider the estimator

$$\frac{N_L^{1|2}(i_1,u_1,v_1;\, j_1,d_1,s_2)}{N_L^{1|2}(i_1,u_1,v_1;\, s_2)},$$

where $L$ is the length of the bivariate series of stock returns,

$$N_L^{1|2}(i_1,u_1,v_1;\, j_1,d_1,s_2) = \sum_{t=1}^{L} 1_{\{Z^1(t)=j_1,\, B^1(t)=d_1,\, \mathrm{sgn}(Z^2(t))=s_2,\, Z^1(t-1)=i_1,\, B^1(t-1)=u_1,\, U^{\lambda_1}(t-1)=v_1\}}$$
Symbol   Stock
AT       Atlantia
MP       Banca Monte dei Paschi di Siena
E        ENI
EN       ENEL
F        Fiat
FN       Finmeccanica
G        Generali
IS       Intesa San Paolo
LU       Luxottica
MS       Mediaset
MB       Mediobanca
PC       Pirelli
PR       Prysmian
SP       Saipem
SR       Snam Rete Gas
ST       ST Microelectronics
TI       Telecom
TE       Tenaris
TR       Terna
UC       Unicredit

Table 1. Stocks used in the application and their symbols



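and $N_L^{1|2}(i_1,u_1,v_1;\, s_2)$ is the analogous count of the times $t$ at which the conditioning event alone, $\{\mathrm{sgn}(Z^2(t))=s_2,\, Z^1(t-1)=i_1,\, B^1(t-1)=u_1,\, U^{\lambda_1}(t-1)=v_1\}$, occurs.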
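A counting estimator of this kind can be sketched as a ratio of two frequency tables. In the example below the inputs are assumed to be aligned, equally long lists of the discretized state, backward time and index value of stock 1 and of the state of stock 2; the function name and the dictionary layout of the result are illustrative assumptions.

```python
from collections import Counter


def sign(state):
    """Sign of a return state coded as a signed integer: -1, 0 or 1."""
    return (state > 0) - (state < 0)


def estimate_p1_given_sign(z1, b1, u1, z2):
    """Relative-frequency estimate of stock 1's conditional one-step law.

    Returns {((i1, u1, v1, s2), (j1, d1)): frequency}, where the first
    tuple is the conditioning event and the second the next (state,
    backward time) of stock 1.
    """
    joint, cond = Counter(), Counter()
    for t in range(1, len(z1)):
        event = (z1[t - 1], b1[t - 1], u1[t - 1], sign(z2[t]))
        joint[(event, (z1[t], b1[t]))] += 1   # numerator count
        cond[event] += 1                      # denominator count
    return {key: n / cond[key[0]] for key, n in joint.items()}
```

Conditioning events that never occur in the sample simply do not appear in the result, so a simulation based on these estimates needs a rule for unseen events.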



4. Empirical results 

The model described in the previous sections was applied to a set of 20 stocks from the Italian Stock
Exchange ("Borsa Italiana"). The list of stocks and their symbols is reported in Table 1. The database
is composed of tick-by-tick quotes recorded from January 2007 to December 2010 (4 full years). The
data have been re-sampled to 1-minute frequency. The number of returns analyzed is then roughly
$500 \times 10^3$ for each stock. A fuller description of the database can be found in D'Amico & Petroni (2011).
Returns have been discretized into 5 states chosen to be symmetrical with respect to zero returns
and to keep the shape of the distribution unchanged. Returns are in fact already discretized in real data
due to the discretization of stock prices, which is fixed by each stock exchange and depends on the value
of the stock. For example, in the Italian stock market, for stocks with value between 5.0001
and 10 euros the minimum variation (usually called the tick) is fixed at 0.005 euros. We then tried to remain
as close as possible to this discretization.
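A symmetric five-state discretization can be sketched as follows. The two thresholds `small` and `large` and the signed-integer coding of the states are assumptions introduced for illustration; the actual bin edges used in the application are not given in the text.

```python
def discretize_return(r, small, large):
    """Map a return to one of five states coded -2, -1, 0, 1, 2.

    Assumed binning: |r| <= small is the central state, small < |r| <= large
    the moderate states, and |r| > large the extreme states, so the
    partition is symmetric around zero by construction.
    """
    size = abs(r)
    if size <= small:
        return 0
    level = 2 if size > large else 1
    return level if r > 0 else -level
```

Because the map depends on `r` only through its sign and absolute value, `discretize_return(-r, small, large)` always equals `-discretize_return(r, small, large)`.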

Following D'Amico & Petroni (2012b), we use as the definition of the function $f$ in (2.1) an exponentially
weighted moving average (EWMA) of the squares of returns, which has the following expression:

$$f(J_{n-1-k}, a, \lambda) = \frac{\lambda^{T_n - a}\, J_{n-1-k}^2}{\sum_{k'=0}^{n-1} \sum_{a'=T_{n-1-k'}}^{T_{n-k'}-1} \lambda^{T_n - a'}} \qquad (4.1)$$
and consequently the index process becomes

$$U_n^\lambda = \sum_{k=0}^{n-1} \; \sum_{a=T_{n-1-k}}^{T_{n-k}-1} \left( \frac{\lambda^{T_n - a}\, J_{n-1-k}^2}{\sum_{k'=0}^{n-1} \sum_{a'=T_{n-1-k'}}^{T_{n-k'}-1} \lambda^{T_n - a'}} \right). \qquad (4.2)$$

FIG. 1. Autocorrelation functions of real data (solid line) and synthetic (dashed line) time series for the analyzed stocks.
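The EWMA index can be computed in a single pass over the time units elapsed before the $n$-th transition, accumulating the weighted squared returns and the normalizing sum of weights together. In this sketch `states[m]` is assumed to hold the numerical return value of the $m$-th visited state and `times[m]` its entrance time; the function name is hypothetical.

```python
def ewma_index(states, times, n, lam):
    """EWMA of squared returns up to the n-th transition (n >= 1).

    Each time unit a in [T_m, T_{m+1}) carries weight lam ** (T_n - a),
    so with 0 < lam < 1 recent time units weigh more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for m in range(n):
        for a in range(times[m], times[m + 1]):
            w = lam ** (times[n] - a)
            weighted_sum += w * states[m] ** 2
            weight_total += w
    return weighted_sum / weight_total
```

Since the weights are normalized, a trajectory whose squared return is constant yields that constant as index value for any `lam`, which is a convenient check.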



The index $U^\lambda$ was also discretized into 5 states of low, medium low, medium, medium high and high
volatility. Using these definitions and discretizations we estimated, for each stock, the probabilities
defined in the previous section by applying their estimators directly to real data. By means of Monte
Carlo simulations we were able to produce, for each of the 20 stocks, a synthetic time series. Each
time series is a realization of the stochastic process described in the previous section with the same time
length as the real data. Statistical features of these synthetic time series are then compared with the statistical
features of real data. In particular, we tested our model for its ability to reproduce the autocorrelation
functions and the cross-correlation between stocks. We recall the definition of the autocorrelation
function: if $R$ indicates returns, the time-lagged ($\tau$) autocorrelation of the square of returns is defined as

$$\Sigma(\tau) = \frac{\mathrm{Cov}(R^2(t+\tau), R^2(t))}{\mathrm{Var}(R^2(t))}. \qquad (4.3)$$

We estimated $\Sigma(\tau)$ for real data and for synthetic data and show in Figure 1 a comparison between them
for 4 stocks chosen from the 20 stocks in the database.
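The sample counterpart of the autocorrelation of squared returns can be sketched as below. The estimator shown (sample covariance at the given lag divided by the sample variance, both around the overall mean) is one common choice and is an assumption here, since the text does not state which sample estimator was used.

```python
def squared_return_autocorr(returns, lag):
    """Sample autocorrelation of squared returns at a non-negative lag.

    Assumes len(returns) > lag and that the squared returns are not all
    equal (otherwise the variance is zero and the ratio is undefined).
    """
    squares = [r * r for r in returns]
    n = len(squares)
    mean = sum(squares) / n
    var = sum((x - mean) ** 2 for x in squares) / n
    cov = sum(
        (squares[t] - mean) * (squares[t + lag] - mean) for t in range(n - lag)
    ) / (n - lag)
    return cov / var
```

Applying the same function to the real and to the synthetic series, over a range of lags, gives the two curves that are compared in the figure.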

Our model reproduces the autocorrelation behavior of these stocks almost perfectly.
Note that each stock has its own best value for the parameter $\lambda$. These results were
already described in D'Amico & Petroni (2012b) for different stocks. Here we also want to test whether the
bivariate model is able to reproduce the pairwise correlation (cross-correlation) between stocks while still
preserving the autocorrelation functions. The definition of the cross-correlation between stocks $\alpha$ and
$\beta$ is:

$$\Sigma_{\alpha\beta} = \frac{\mathrm{Cov}(R_\alpha, R_\beta)}{\sqrt{\mathrm{Var}(R_\alpha)\,\mathrm{Var}(R_\beta)}}. \qquad (4.4)$$

Table 2. Cross-correlation matrix (multiplied by 100) for real data.

From the real time series and from the synthetic ones we estimated the cross-correlation for each
pair of stocks. Note that the matrix is symmetric with respect to stocks $\alpha$ and $\beta$. We therefore report in the
tables only a lower triangular matrix.
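A lower triangular cross-correlation table of this kind can be produced with a few lines of code. The sketch below assumes that the return series are aligned in time and of equal length, and stores the result as a dictionary keyed by pairs of stock symbols; the function names and the rounding to integers are illustrative choices.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation of two aligned, equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lower_triangular_corr(series):
    """Cross-correlations times 100 for every pair below the diagonal.

    series maps a stock symbol to its list of returns; the key
    (row, column) follows the insertion order of the symbols.
    """
    names = list(series)
    return {
        (names[a], names[b]): round(100 * correlation(series[names[a]], series[names[b]]))
        for a in range(len(names))
        for b in range(a)
    }
```

Running it once on the real returns and once on the simulated ones gives two tables that can be compared entry by entry.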

From the two tables reported here it is possible to note that our bivariate model is able to reproduce
more than 50% of the cross-correlation. In our view this is a good result, given that the dependence
between stocks is modeled in a very simple way.



5. Concluding remarks 

With the aim of reproducing the cross-correlation between stocks, and following our previous works on
univariate return models, we have modeled financial price changes through a bivariate weighted-indexed
semi-Markov model.

The results presented here show that the semi-Markov kernel is influenced by the past volatility
and that this influence decreases exponentially with time. In fact, if the past volatility is used as an
exponentially weighted index, the model is able to reproduce almost exactly the behavior of market
returns: the returns generated by the model are uncorrelated, while the squares of returns present a
long-range correlation very similar to that of real data. Moreover, the generalization to a bivariate process, even
if very simple, is able to reproduce more than 50% of the real cross-correlation.

Table 3. Cross-correlation matrix (multiplied by 100) for synthetic data.

We stress that our model is very different from those of the ARCH/GARCH family. We do not model
the volatility directly as a correlated process. We model returns, and by conditioning the semi-Markov
kernel on a weighted index the volatility correlation emerges endogenously.